ReSense: A Unified Framework for Improving Performance and Reliability in Multicore Architectures

Not recorded
Abstract

Chip-multiprocessors (CMPs) have become ubiquitous in modern computing and are the mainstream architecture for various platforms, including laptops, desktops, and large server machines. As technology scaling continues and more transistors are accommodated on the chip, the number of cores on CMPs is growing, and multi-core machines are scaling up to many-core machines. With this multi-core scaling, two major problems arise: shared-resource contention and soft errors (transient faults). Shared-resource contention can degrade an application’s performance significantly, and soft errors increase the probability of incorrect application execution and the production of visible errors. To realize the full potential of multi- and many-core platforms, it is critical to ensure that applications in a workload execute not only efficiently and fast, but also correctly.

In this dissertation, we develop a novel, general, and unified framework, ReSense, to address several challenges on multicore architectures, including performance optimization, reliability improvement, and power and thermal management. The framework includes five components: a general characterization methodology, a characterization metric, a sensitivity score, a thread-mapping algorithm, and a run-time system. An instance of the framework is applied in two phases: characterization and mapping. The characterization phase uses the general characterization methodology and characterization metric to identify application characteristics without considering any co-runner(s). It generates a resource-sensitivity score for each application in a workload. In the mapping phase, the run-time system uses a thread-mapping algorithm and the sensitivity scores of the applications in a workload to determine the thread mappings that optimize the objective function of the targeted problem.
To demonstrate the utility and effectiveness of ReSense, we use it to address the problems of shared-resource contention and soft errors for multi-threaded applications. For the resource contention problem, the characterization methodology determines how a multi-threaded application’s performance is affected as it shares a resource in the memory hierarchy. A sensitivity score based on resource contention is produced for each application in a workload. The run-time system uses the resource-contention sensitivity scores and a thread-mapping algorithm to allocate threads from a workload to cores to mitigate shared-resource contention, thus improving response time and throughput.

For the soft error problem, the characterization methodology determines how a multi-threaded application’s vulnerability to soft errors in shared caches is affected by its resource occupancy duration. A sensitivity score based on cache occupancy is produced for each application in a workload. The run-time system uses the cache-occupancy sensitivity scores and a thread-mapping algorithm to allocate workload threads to cores to reduce the occupancy in the shared caches, thus reducing cache vulnerability.

Both minimizing an application’s vulnerability to soft errors and maintaining application performance are critical. A thread-mapping algorithm that ensures better reliability may not ensure better performance. To address this problem, we develop an integrated instance of the framework that combines application characterizations for both contention and vulnerability to determine a trade-off between the performance and reliability improvements.

The dissertation includes a comprehensive evaluation of all three instances, which indicates that mapping each application in a dynamic workload according to its solo characterization is highly effective. For the resource contention instance, response time and throughput improved by up to 30% and 47%, respectively, over the native operating system.
For the soft error instance, cache vulnerability was reduced by up to 70% relative to the native operating system. The integrated instance achieved a range of trade-offs between response-time and vulnerability reductions.
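The abstract does not specify the thread-mapping algorithm, so the following is only a minimal sketch of how per-application sensitivity scores could drive a mapping. It assumes a hypothetical greedy heuristic that spreads the most sensitive applications across groups of cores sharing a cache; the function name `map_by_sensitivity`, the score values, and the notion of a "cache-sharing group" are illustrative assumptions, not the dissertation's method.

```python
from typing import Dict, List


def map_by_sensitivity(scores: Dict[str, float], num_groups: int) -> List[List[str]]:
    """Illustrative greedy mapping (an assumption, not ReSense's algorithm).

    Places each application, most sensitive first, into the cache-sharing
    group whose accumulated sensitivity score is currently the lowest.
    """
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")

    groups: List[List[str]] = [[] for _ in range(num_groups)]
    totals = [0.0] * num_groups

    # Sort by descending score; break ties by name so the result is deterministic.
    for app, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
        # Pick the group with the smallest total (lowest index wins ties).
        target = min(range(num_groups), key=lambda g: totals[g])
        groups[target].append(app)
        totals[target] += score

    return groups
```

Called with hypothetical scores such as `{"app_a": 8.0, "app_b": 5.0, "app_c": 2.0, "app_d": 1.0}` and two groups, the heuristic keeps the most sensitive application alone and co-locates the three less sensitive ones. A real run-time system would also have to account for thread counts, core topology, and dynamic workload changes.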

Acknowledgments

This dissertation would not have been possible without the help and support of many people in my life.

First and foremost, I acknowledge my PhD advisors, Mary Lou Soffa and Jack Davidson. Over the past six years, they have taught me how to think critically, write clearly, express ideas, give good talks, and, most importantly, how to do good research. They have mentored me and supported me both professionally and personally whenever I went through a tough time in my life. They have always been patient with me whenever I struggled to find my research direction, and with the countless drafts of the papers I wrote with them. I cannot thank them enough and am very grateful for what they did for me. They truly have been my academic parents.

I would like to acknowledge the members of my dissertation committee, Sudhanva Gurumurthi, Mary Jane Irwin, and John Lach. They gave me feedback on my dissertation proposal that made the work better. I would especially like to thank Sudhanva. Whenever I wanted his advice about the reliability work, he always made time in his busy schedule to discuss it with me.

I thank my lab and research mate, Wei Wang, for his support, feedback, and honest opinions about my work. I acknowledge Jason Mars, who has been a good mentor to me.
In the initial years of my PhD, both Jason and Lingjia had long discussions with me about my research and shared their thoughts about how to write a good paper. I also thank Kristen Walcott and Jing Yang for their feedback on my early ideas for the dissertation proposal.

I thank the systems staff, especially Scott Ruffner and Essex Scales, for their help and assistance whenever there was any problem with the machines and servers. They have always tried to accommodate any request I had. I also thank the CS department staff for keeping me on track with official paperwork.

I am grateful to all my teachers in Bangladesh, from elementary to undergraduate school. They have all contributed to my intellectual ability, from improving my handwriting (which I hardly get to do anymore) to building in-depth and broad knowledge of computer science.

I acknowledge all my friends and acquaintances at UVA and in Charlottesville. I am grateful to Taniya Siddiqua for being such a great mentor and sister to me. She has inspired me in numerous ways and given me the courage and strength to survive the hard graduate life. My life in Charlottesville would not have been the same without so many friends here, including my “bachcha-party”: Yamina, Anindya, Emi, Asif, Samee. Special thanks to Juhi for keeping me active and full of spirit during the last six months, the most stressful time of my graduate life.

I acknowledge Charlottesville for the wonderful six years. I have loved every moment I have lived here, including the beauty of the Blue Ridge Mountains, the beautiful fall, every snowstorm in the winter, and the soothing rain in the summer. This is the place where two of my dreams came true. Charlottesville will always be my second home and remain special to me.

I am very blessed to have many dear friends in my life. Some have been friends with me for almost 20 years: Shemul (Mollick), Setu, Tithi.
I acknowledge Laboni and Shantonu for listening to my endless complaints about everything that goes wrong in my life, including paper rejections. I acknowledge Chayan, Sagar, Shafi, and Nabila for keeping my spirits high whenever I felt low. I am grateful for their continuous support and unconditional friendship, which gave me strength throughout my graduate life.

I acknowledge my family. Didi and Dida have been a constant source of encouragement. Pishi taught me how to think positively in life, which helped me during my graduate life. I acknowledge all my Kaka-Kakimas, Mama-Mamis, and cousins for supporting me during the hard times in my life. I acknowledge my family here in the USA, MonDidi and family, SamarDada and family, Kumar, Ann, and Kent, for making a foreign country feel like home. I acknowledge my Baba for inspiring me to overcome the difficulties and frustrations in life.

I acknowledge my husband, Enamul: my best friend, harshest critic, biggest admirer, partner in crime, and co-pilot in life. Ever since I met him, he has been with me through every up and down of my life. He has made me a better version of myself. He has encouraged me, supported me, and made me believe in my abilities and myself. I am more confident when I have him by my side. I can’t wait to start the next phase of our lives together.

Last but not least, I acknowledge my mother, Purabi Dey, and my Dadu, Anil Kumar Dey. I can’t express in words their significance in my life. When I was in Bangladesh, Dadu used to stop by my study room every time I had an exam. I really wish he were alive to see me cross the PhD finish line, where he always wanted me to be. Whenever I got frustrated and felt like giving up my PhD work, the biggest force that kept me moving was the thought of Maa and all the sacrifices she made for me in her life. I dedicate my dissertation to her.

Similar resources

Design of a novel congestion-aware communication mechanism for wireless NoC architecture in multicore systems

Hybrid Wireless Network-on-Chip (WNoC) architecture has emerged as a scalable communication structure to mitigate the deficits of traditional NoC architecture for future multi-core systems. The hybrid WNoC architecture provides energy-efficient, high-data-rate, and flexible communication for NoC architectures. In these architectures, each wireless router is shared by a set of processing core...

Reliability and Performance Evaluation of Fault-aware Routing Methods for Network-on-Chip Architectures (RESEARCH NOTE)

Nowadays, faults and failures are increasing, especially in complex systems such as Network-on-Chip (NoC) based Systems-on-a-Chip, due to increasing susceptibility and decreasing feature sizes. On the other hand, fault-tolerant routing algorithms have an evident effect on tolerating permanent faults and improving the reliability of a Network-on-Chip based system. This paper presents reliabili...

Evaluating multicore algorithms on the unified memory model

One of the challenges to achieving good performance on multicore architectures is the effective utilization of the underlying memory hierarchy. While this is an issue for single-core architectures, it is a critical problem for multicore chips. In this paper, we formulate the unified multicore model (UMM) to help understand the fundamental limits on cache performance on these architectures. The ...

Burst Mode Processing: An Architectural Framework for Improving Performance in Future Chip MultiProcessors

A new family of chip-level multiprocessor architectures called Bright Core Multicore Processor (BCMP) is presented, in which individual cores can operate either in normal mode at nominal operating clock frequency or in burst mode (at frequencies in the range of 15 GHz or more). BCMP architectures represent a class of temporally overprovisioned computing systems that allow trade-offs between late...

Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures

The current trend toward multicore architectures underscores the need for parallelism. While new languages and alternatives for supporting these systems more efficiently are proposed, MPI faces this new challenge. Therefore, up-to-date performance evaluations of current options for programming multicore systems are needed. This paper evaluates MPI performance against Unified Parallel C (UPC) and Ope...

IMPRESS: Improving Multicore Performance and Reliability via Efficient Support for Software Monitoring (University of California, Riverside)


Publication date: 2014